What is behind the gender gap in economics distance education: Age, work-life balance and COVID-19

There is an ongoing debate about whether gender equality in education has been achieved or not. Research efforts have focused on primary and secondary education, while there are fewer studies on higher education, and few studies refer to distance education. To help address this gap, this article presents a gender analysis of educational outcomes in economics at Spain’s leading distance university, UNED, which is also the largest university in the European Union in terms of enrolment. The aim of the article is to assess whether there is a gender gap in academic results and to identify the sociodemographic and academic variables that may be causing such a gap by analysing how they shape such differences. Finally, the impact of COVID-19 is also considered. The results confirm that women underperformed significantly in our sample in terms of passing and scoring, especially among those between 30 and 45 years of age, who are more likely to have young children. When considering a distribution of family tasks biased against women, along with the higher average age of distance learning university students, gender gaps could probably be greater in nonface-to-face education. COVID-19 narrowed the gender gap during the lockdown period, as some men and women staying at home together were able to improve task sharing capabilities. After the lockdown, however, women’s results worsened compared to pre-COVID-19 levels. A possible explanation is that they had to continue performing the same family duties in addition to substituting education and caring services (e.g., nurseries and day centres for the elderly) that did not resume activity immediately or continuously.


Introduction
There is an ongoing debate about whether gender equality in education has been achieved. Although men seem to perform better on standardized aptitude and achievement tests, women outperform men in scores on a whole subject or academic year [1][2][3]. Voyer & Voyer (2014), through a meta-analysis [2], found evidence suggesting that women perform slightly better than men at all stages of the education system and in all branches. However, there are so many exceptions and nuances to this evidence that they preclude reaching an unequivocal

Outlook and literature review
Although the debate regarding gender equality in education is ongoing, some researchers support the hypothesis that women generally outperform men in education results from elementary school to the university level [2]. However, not all women outperform: nationality, age, income and other factors may have a strong influence. UNESCO measured and recognized both facts in primary and secondary education [25], but gender intersectionality data are not systematically available for higher education. Voyer & Voyer [2] point out in their meta-analysis that this female advantage is small and that the nationality and gender composition of samples are significant moderators of effect sizes. Although women were more likely to obtain better scores, this trend did not occur at all ages or in all subjects [26].
In 2018, women accounted for more than half of 17.5 million tertiary students in the EU-27 [27]. Women are the majority in most fields except for some branches of STEM [4]. However, gender equality is measured not only by access but also by results, as occurs in the labour market. There is no systematic measurement of higher education outcomes at the global level, although the EU and national systems are developing data disaggregated by sex [28,29]. Within the EU-27, almost 60% of all graduates in 2018 were women, and the distribution of graduates across fields of education by sex is similar to that of enrolled students. Similarly, women tend to obtain slightly better scores. In Spain, the average scores for a bachelor's degree in the 2019/20 academic year were 7.39 for women and 7.09 for men, similar to previous years [29]. The same occurs in the social sciences and economics, where our focus lies. Beyond scores, perceived and observed learning outcomes did not differ significantly by sex [30,31].
Regarding e-learning, there are few differences between male and female students in their enrolment, motivation and satisfaction patterns, mainly regarding their interactions with technology [32]. There are no gender differences in student satisfaction among millennials [33], but beyond this age range, there is mixed evidence [31,32,34,35]. Lu & Chiou [35] point out that gender and job status significantly influence students' satisfaction and their perceptions about interface usability, community membership, content richness and flexibility. However, according to these authors, only job status and learning styles moderate the relationship between previous perceptions and satisfaction, while gender has no effect in e-learning environments.
Comparisons of online and on-campus settings show statistically significantly better online learning outcomes than traditional learning results [31]. Women were found to outperform men in academic achievement both online and on campus [36]. It seems that different motivational factors (self-determination, culture or study as a hobby, job opportunities and promotion, etc.) are influenced by sex and age, which are reflected in educational outcomes [37]. The positive relationship between sex and academic performance is explained when students are older, but deep learning is more likely deployed by older women [36]. Age itself does not predict adult students' learning satisfaction and performance [38], but when women underperform, Richardson et al. [24] suggest that this may be because, within a particular age range, women usually combine occupational and domestic responsibilities.
However, some variables are missing from the literature on gender gaps in education. When nationality has been traditionally included in educational research, it has been mainly done for the analysis of the integration of immigrants in primary or secondary schools. On many occasions, students' sex is also considered in analyses. However, when taking into account the age difference between average UNED and university students, our subjects of analysis are mainly adult individuals who work and study outside their countries of origin.
Including this variable in the distance learning context could be an interesting novel approach, as we have not found references that deal with context.
Terms represent another interesting variable not explored in the literature. Perhaps this may be because in face-to-face education, with most students engaged in full-time study, the differences between terms are not significant. However, when students combine study, work and family life, as is the case in nonface-to-face education, fatigue may be a relevant factor. Our exploratory analysis shows significant differences between semesters, leading to their inclusion in the analysis.

COVID-19, higher education and domestic and care work: Does sex matter?
Over the course of the COVID-19 lockdown, tertiary students were forced to shift to online education worldwide. Previous research has focused on how the initial movement to online teaching among face-to-face universities due to measures against COVID-19 has widened socioeconomic educational gaps, even leading to widening inequality and increasing poverty in countries such as the US, Italy, Sweden and Turkey [8,9,11,12]. Although many use sex as a control variable [8,11,12], these articles are not focused on the gender gap, and gender intersectionality is not even considered. In contrast, Casalone et al. [10] apply a gender perspective while drawing national comparisons, bringing intersectionality into gender analysis. Age, sex, ethnicity, nationality and income level are among the key factors considered in the literature related to the effects of COVID-19 on tertiary education.
In Spain, passing rates and scores seem to have improved. Gonzalez et al. [39] attribute better student performance to a general change in the autonomous learning process, as assessment activities and learning methodologies cannot explain it. Students work with more adequate time management, creating positive results. On the other hand, students from families with a low educational level had fewer opportunities to use digital technologies during the COVID-19 lockdown [40]. Adaptation to online higher education depends on a set of factors, including institutional and pedagogical responses, individual self-regulatory and socioemotional competencies, and adequate resources [41].
However, COVID-19 has also affected higher education distance learning institutions. UNED had to shift examinations from face-to-face to online formats, which were temporarily maintained postlockdown until September 2021. The lockdown may have provided more time to study and improved results, as evidence suggests for Turkey and Italy [10]. In addition, men stayed at home more, so they could carry out more tasks than usual [21]. However, we do not know if these changes are enough to compensate for difficulties caused by the disruption of the whole education system, from daycare and preschool centres to primary and secondary schools, which forced babies and children to stay at home. International comparisons suggest that forced isolation did not lead young women to neglect their studies to dedicate time to housework or childcare among those enrolled at on-campus universities that switched to online methods due to the pandemic [10]. However, the results may not be similar when most students are middle-aged, as is the case at UNED.
To our knowledge, the gender perspective has not been included in the analysis of COVID-19 effects on Spanish academic results, even when the sex variable has been available [39,40]. Hence, this analysis aims to partially overcome this gap in the literature regarding distance education. UNED academic results are analysed by considering the sex variable systematically with other factors that may contribute to explaining the gender differences observed. In particular, this research assesses whether the lockdown and the switch to online exams produced significant changes in students' results and induced additional gender differences. Moreover, given the age and labour status of UNED students, it is relevant to consider the gender inequalities observed before and during the COVID-19 crisis related to the distribution of paid and unpaid work. As with that focused on higher education, the literature on the effects of the COVID-19 crisis on the labour market, care and gender inequality [16-20, 22, 23] has mainly focused on the first months of this declaration, when the restrictions were the most severe. Among the initial effects were the large percentages of the employed population not working, a phenomenon more widespread among the less educated and blue-collar workers, and of those working from home, especially among university-educated employees and white-collar workers, to a greater or lesser extent depending on the national context [20]. Similarly, an increase in the gender gap in total working hours is observed because the decrease in the number of hours of paid work does not compensate for the increase in time spent on unpaid work and the previous difference between men and women [19]. Women continued to bear a greater share of the burden of domestic work [16,18,19,23], although men's physical presence in the household helped slightly increase their participation in domestic and care work [16,17,23,42].
This increased burden of unpaid work by women, even in an exceptional and critical situation such as the lockdown suffered in spring 2020, is independent of their labour market status [19,22]. Men increased their involvement in domestic and care work, but the increased assumption of responsibilities did not lead to a generalized reduction of the gender gap in unpaid work [17-19, 22, 23], as the absence of educational services and the impossibility of outsourcing domestic and care work meant that many women also increased their participation in unpaid work.
This picture of gender inequality in paid and unpaid work remained consistent at least until June 2021 in Spain [21], where many conditions have been maintained to a greater or lesser extent from the lockdown to the present. In Spain, the second and subsequent waves of COVID-19 brought states of alarm until June 2021, which also entailed severe restrictions on mobility and contact with noncohabitants, maintaining strong encouragement to work from home, as well as partial or occasional restrictions in many educational centres, although educational services were theoretically resumed. At UNED, for example, exams remained online for 2020-2021.

Ethical approach to participants
In November 2021, UNED approved two nonfunded teaching innovation projects named "Inclusion of the gender perspective in teaching innovation" and "Improvements for academic performance in economic policy subjects after COVID-19" coordinated by two of the authors. These two projects form the basis of this article.
Although this research involved human participants, the study did not require ethics committee authorization according to UNED's internal protocols, as it meets the three requirements established for this approval not to be needed:
• The research does not affect the fundamental rights (life, physical/psychic integrity, health, freedom/autonomy in any of its manifestations, personal dignity, etc.) of the subjects involved.
• Only nonidentifying personal data are used.
• Only data available for researchers on the quality of teachers are used.
The Vice-Chancellor for Research, Knowledge Transfer and Scientific Dissemination, who is also the president of the UNED ethics committee, certified this fact. Data were anonymized by a random transformation of student IDs. Data collection did not include minors, and no personal data were gathered other than data on sex, age and nationality. These data were collected during the enrolment process, and students were informed that they may be used for the university's purposes, including research. The participants may, at any time, exercise their rights of access, rectification, cancellation or opposition of their data, but they have not done so thus far. This research did not interfere with the activities, processes or assessments of the participating students in any way. Moreover, the teachers responsible for the subjects included in the study gave their verbal consent to carry out this research.

Gender equality in Spanish higher education: The UNED case
In Spain, women in higher education follow European trends. Women were the majority in undergraduate, master's and doctoral studies in the 2019/20 academic year [29]. In the social sciences and law, women accounted for 60% of those enrolled and even 65% of those who finished their bachelor's degree [43]. However, the student profile is clearly different for face-to-face universities than for nonface-to-face universities. For instance, in the social and legal sciences, in the 2018/19 academic year, almost 81% of enrolled students were under 25 years of age, and only 5% were over 30 years of age, while at UNED, only 16.04% of the students were under 25 [44]. Thus, it cannot be assumed that gender equality or gender gaps behave in the same way. Consequently, it is also interesting to analyse gender equality in nonface-to-face education.
Distance universities in Spain had 264,857 students enrolled, representing 16.2% of university students, in 2019/20 [29]. UNED is Spain's national nonface-to-face public university and is the oldest distance education university in the country: in 2022, it celebrated its 50th anniversary. Since 2013, it has been the largest university in the EU by enrolment, both face-to-face and distance [5,6]. In Spain, educational competencies are transferred to regions (autonomous communities), but UNED was founded in 1972, before the birth of autonomous communities, and has succeeded in remaining a national institution. In addition to UNED, there is only one other national university, Menendez Pelayo International University, but it does not teach undergraduate courses. In the 2019/20 academic year, 157,418 students were enrolled at UNED [44], representing almost 60% of Spanish nonface-to-face students. A total of 53.7% were women, and this level rose to 54.9% in 2020/21 [44]. From these numbers, it is clear that UNED students are a representative sample of distance education in Spain. Table 1 shows the gendered learning outcomes gap at UNED.
As shown in Table 1, across UNED, women are enrolled in more subjects (55%) than men, and they tend to take exams in higher proportions (5.7 percentage points (p.p.) more than men), but they pass at lower proportions (2.2 p.p.), although their scores are similar (slightly lower by 0.07). This trend is found in most fields with some exceptions. Education and STEM, which are the most female- and male-dominated subjects, respectively, present almost no gender gap in passing grades and scores. Psychology, economics and business, and political science and sociology present a higher gender gap in passing grades than arts and the humanities, STEM and even the social sciences as a whole. Therefore, is there a gender bias in these faculties?
To answer this question, the article focuses on the UNED Faculty of Economics and Business and, specifically, on a set of applied economics subjects described in the next section. This faculty and the examined subjects were chosen for several reasons. First, the faculty seems to have one of the largest gender gaps regarding enrolment, passing grades and scores of the university (see Table 1). While women tend to be the minority in economics departments [45,46], more evidence is needed on the learning outcomes of gender gaps in economics. Second, the chosen subjects are drawn from similar fields of knowledge (economic policy and public finance), and have a similar evaluation system that facilitates the comparability of results. Third, none of the subjects are first-year subjects, and all but one are mandatory, which means that all students should have a basic knowledge of economics before enrolling and that all students must pass them to earn the degree. Finally, the authors of this article teach these subjects and, therefore, are allowed to access all academic and nonacademic data of their students in accordance with the protocols of the UNED ethics committee (see previous section).
In summary, considering the particularities of UNED, this research could be valuable for several reasons. First, the results do not necessarily correspond to the national average educational outcomes by sex, as our data are from a specific field of knowledge (applied economics) and from a nonface-to-face university. Second, UNED students are not necessarily young people who follow a specific training path but mostly adults with work and family responsibilities. In addition, the factors inherent to the UNED learning/teaching process could affect their results. UNED students choose this university because it affords them the opportunity to combine their studies with family (75%) or work obligations (87.8%) [47]. Thus, as middle-aged women usually assume more family responsibilities than their male counterparts, they are expected to show worse results. In contrast, deep learning (the attempt to understand the meaning and logic of course materials and their arguments, as opposed to memorization) is usually related to older students, although it is mediated by sex [36], so this can also be analysed. Third, although UNED is a distance learning university, it has also been affected by the COVID-19 lockdown. The gender perspective has not yet been included in the analysis of COVID-19 effects on Spanish academic results.

Data and variables
The sample includes 7,477 UNED students of the Faculty of Economics and Business from the 2016/2017 to 2020/2021 academic years. The unit of analysis is enrolment by subject and student. The students in the sample applied for a total of 16,821 enrolments. The nine subjects selected are taught in only one semester, as is most common in degrees at UNED. Subjects belong to two fields, economic policy (five subjects) and public finance (four subjects), taught in four degrees: economics (68%), business administration (20.6%), tourism (10.3%) and political sciences (1.1%). Table 2 shows the main academic features of the subjects, including their enrolment levels by academic year and the average percentage of women

PLOS ONE
enrolled. As expected, the sample is male dominated (62.5% of students are men), except in the tourism program, for which the shares are reversed (67.5% women). Table 3 shows descriptions and measurements of the eleven variables, dependent and independent, used in the models. Data were collected exclusively from three different UNED internal sources:
• The dependent variables, which measure academic performance (evaluated, passed and scored), were obtained from the administrative database of the university.
• Most of the variables (sex, age, nationality, term and degree) were obtained from the enrolment process.
• The last three variables were collected from the university online learning platform (continuous assessment (CA) test and forum messages) or constructed by the authors (COVID-19).
Most of the variables shown in Table 3 are self-explanatory or are clearly specified in the table. However, some need further explanation to put them in the context of UNED and, in general, of nonface-to-face higher education.
All three dependent variables (evaluated, passed and score) measure learning outcomes and are commonly used in the literature. For instance, evaluated and passed are similar variables to withdrawal and failure rates, while scoring or grading is almost always considered, although the scoring systems used can vary [48]. The three variables are related somehow to exams. At UNED, final ordinary exams on each subject are taken at the end of the term (February and June), but there is also a major examination period for both terms in September. However, we

focus on regular university terms, as their learning outcomes cover most of the academic year, they occur closer to the teaching period, and taking them into account together with the extraordinary results would make the analysis excessively complex. Once students are enrolled in a subject, they can either take the final exam and be evaluated correspondingly or not. They have a maximum of six opportunities in total to pass each subject. Thus, many of them do not take an exam even if they are enrolled if they believe that they are not sufficiently prepared or are unlikely to pass. In the end, when students do not take their exams, this is a failure and a waste of resources, both private and public, although we have no data to analyse the reasons this occurs (which may be related to work, personal reasons, or medical or academic factors). An exam is scored on a scale of 0 to 10. When the voluntary CA test is taken, its score (between 0 and 1) is added to the exam grade. When an overall score of 5 or higher is achieved, the student has passed the subject.
Regarding "terms" or semesters, Spanish university students traditionally enrol for a full academic year from October to September, including subjects taught in the first (October-February) and second (February-June) terms. Until recently, students could not freely enrol in February, as is possible in other education systems. Even today, enrolling in February at UNED is conditional on having registered for a minimum number of credits or subjects in the first term. With the exception of official internships, all subjects at the UNED Faculty of Economics are taught independently in just one term. Subjects taught in the upper years are usually more complex than those taught in the first years, but, within the same year, the level of difficulty of the terms is similar. At UNED, most students enrol in second-term subjects not knowing their first-term results; they combine study, work and family, as is the case in nonface-to-face

education, and finally, the Christmas holidays occur immediately before the first-term final exams. In contrast, in the second term, there are no days off before final exams. These three factors together could explain the worsening of learning outcomes in the second term, as disappointment and/or fatigue may affect the students. The difference between terms could be seen as a first indication of the higher drop-out rates of nonface-to-face higher education detected in the literature [49,50].
When the COVID-19 lockdown was established in Spain (March 15, 2020 to June 21, 2020), distance learning centres such as UNED did not have to completely change their learning methodologies, although some adjustments had to be made. For instance, at UNED, noncompulsory, complementary face-to-face group mentoring and face-to-face examinations went online. Additionally, the change in the way exams were taken also led to a loss of flexibility in combining studies with other responsibilities, such as work or family. Before COVID-19, exams for each subject were held at UNED centres across the country and abroad in morning and evening shifts in two different weeks. Students could opt for either shift without having to ask or inform anyone. When the university was forced to switch to online exams, students were assigned by surname to one of the shifts so as not to overload university servers. Although a change in shift could be requested, there had to be a justification, and this had to be done in advance, eliminating the flexibility to adapt to unforeseen events that are common when caring for family members. Therefore, UNED students had to adapt to the lockdown measures, not only outside but also within the university, and it is relevant to study whether gender issues affected this adaptation through a comparison of the pre-COVID-19, lockdown and postlockdown periods.
Table 4 presents descriptive statistics of both the dependent and independent variables.
They are separated into different panels according to the type of variable (categorical in Panel I and continuous in Panel II). Most students did not take the exams and were not evaluated. Although women were evaluated in a slightly higher proportion than men (49.5% vs. 47.4%), they had worse academic results, as 42% of women did not pass in comparison to 33.3% of men, and women had an average score of 4.81 while men had an average score of 5.3. Students are older than the national average, as explained in subsection 3.2. Only 4.3% of the students are foreigners, and enrolment is balanced between the terms. Men and women participated almost equally in the CA test and in forums, although most of them did not take the CA test (71.2%), and the average number of messages by a student was very low (0.38). The correlations table, which shows no relevant correlations, is presented in Table 5.
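The way the three dependent variables follow from an exam sitting, as described earlier in this section, can be illustrated with a minimal Python sketch. The function and field names are hypothetical, and two points are assumptions rather than facts from the text: the final score is not capped at 10, and "passed" is left undefined for students who did not sit the exam.

```python
from typing import Optional

PASS_MARK = 5.0  # an overall score of 5 or higher passes the subject


def outcome(exam: Optional[float], ca: float = 0.0) -> dict:
    """Derive evaluated / passed / score for one enrolment.

    exam: final exam grade on the 0-10 scale, or None if the exam was not taken.
    ca:   voluntary continuous assessment score between 0 and 1 (0 if not taken).
    """
    if exam is None:
        # Assumption: without an exam there is no score and no pass/fail result.
        return {"evaluated": False, "passed": None, "score": None}
    if not 0.0 <= exam <= 10.0:
        raise ValueError("exam grade must lie between 0 and 10")
    if not 0.0 <= ca <= 1.0:
        raise ValueError("CA score must lie between 0 and 1")
    score = exam + ca  # the CA score is added to the exam grade (no cap assumed)
    return {"evaluated": True, "passed": score >= PASS_MARK, "score": score}


print(outcome(4.5, 0.6))  # the CA test lifts this student over the pass mark
print(outcome(4.5, 0.4))  # still below 5
print(outcome(None))      # enrolled but not evaluated
```

The sketch makes explicit that the CA test can only change the pass/fail result for exam grades between 4 and 5.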

Models
The method chosen to estimate the models is contingent on the dependent variable. On the one hand, if the dependent variable is binary (evaluated and passed), we use a Logit model. Logit models were estimated by maximum likelihood. This method estimates the parameters maximizing the probability of obtaining the observed data [51]. In these cases, the effect of the independent variables is provided, including log-odds or logits and odds ratios. The percentage change in the odds is also included in the supplementary information. On the other hand, for the continuous dependent variable score, we estimate an ordinary least squares (OLS) multiple linear regression model.
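The relationship between log-odds, odds ratios and the percentage change in the odds can be sketched in a few lines of Python. The helper names are hypothetical; the two odds ratios are derived here from the percentage decreases reported in the Results (41.8% for women's odds of passing and 23.4% for foreigners' odds of evaluation), not copied from the paper's tables.

```python
import math


def odds_ratio(log_odds: float) -> float:
    """Convert a logit coefficient (log-odds) into an odds ratio."""
    return math.exp(log_odds)


def pct_change_in_odds(ratio: float) -> float:
    """Percentage change in the odds implied by an odds ratio."""
    return (ratio - 1.0) * 100.0


# Odds ratios implied by the reported percentage decreases (derived values).
women_passing_or = 1.0 - 0.418        # odds of passing decrease by 41.8%
foreigner_evaluated_or = 1.0 - 0.234  # odds of evaluation decrease by 23.4%

for name, ratio in [("women, passed", women_passing_or),
                    ("foreigners, evaluated", foreigner_evaluated_or)]:
    print(f"{name}: OR={ratio:.3f}, log-odds={math.log(ratio):.3f}, "
          f"change in odds={pct_change_in_odds(ratio):.1f}%")
```

An odds ratio below 1 corresponds to a negative log-odds coefficient and a negative percentage change in the odds.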
Since our data are cross-sectional (students do not remain in all the courses in the sample), but students can enrol several times in one subject or in more than one subject at the same time, we used population-averaged (pooled) models to estimate the parameters [52]. In both logit and regression models, the standard errors are corrected with cluster-robust standard errors by student ID to control for heteroskedasticity [53], since a student can enrol in one subject several times. In addition, for each categorical variable, we set a base category, the one with the greater sample size (men, Spanish, the first term, economics, no CA test and the pre-COVID-19 period), as a reference for interpreting the log odds, the odds ratio, and the coefficients of categorical variables. This is why some values, the base categories, do not appear in all the tables or figures. The objectives of this research are twofold: first, we seek to confirm or reject the existence of a gender gap in higher education in economics at the student level at a distance university, and second, we wish to identify how the independent variables shape differences between men and women in the dependent variables. This is why we include interactions in the design of the models. Interactions model how the coefficient for one variable differs according to the values of another variable [54]. In our research, we hypothesize that the effect of sex on the dependent variables varies depending on up to four independent variables (age, nationality, term and COVID-19). Interactions are included in the models by computing the product of two independent variables, which are also included as main effects. For example, a model including the interaction between sex and terms would include the two main effects (sex and terms separately) as well as the interaction (sex × term).
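How an interaction lets the effect of sex differ by term can be shown with a small Python sketch. All coefficient values below are hypothetical and chosen only for illustration; they are not estimates from this study. The base categories (men, first term) are coded 0, as in the models.

```python
# Hypothetical log-odds coefficients, for illustration only.
COEF = {
    "const": 0.2,
    "woman": -0.3,                # main effect of sex (base: men)
    "second_term": -0.4,          # main effect of term (base: first term)
    "woman_x_second_term": -0.2,  # interaction: product of the two dummies
}


def linear_predictor(woman: int, second_term: int) -> float:
    """Log-odds for a given combination of the two dummy variables."""
    return (COEF["const"]
            + COEF["woman"] * woman
            + COEF["second_term"] * second_term
            + COEF["woman_x_second_term"] * woman * second_term)


def sex_effect(second_term: int) -> float:
    """Difference in log-odds between women and men within a given term."""
    return linear_predictor(1, second_term) - linear_predictor(0, second_term)


print("sex effect in first term: ", round(sex_effect(0), 3))
print("sex effect in second term:", round(sex_effect(1), 3))
```

In the base term the sex effect equals the main effect alone, while in the second term it is the sum of the main effect and the interaction coefficient.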

To address potential multicollinearity problems, the eight independent variables are divided into three groups: sociodemographic, academic and COVID-19. The first group of sociodemographic variables includes sex, age and nationality (Spanish or foreigner). The second group includes the academic variables: term (first or second term), degree (the four aforementioned degrees), continuous assessment test and student messages in forums. For each dependent variable, three nested models are estimated following a nested estimate procedure [55], which allows for detecting and reducing multicollinearity problems between the independent variables. The first models include only the three sociodemographic variables. The second models, or whole-single models, include all the independent variables but only their main effects. Finally, the third models, or whole models with interactions, include the eight independent variables, main effects, and interactions of key independent variables. The three nested models can be outlined as follows:
1. Dependent variables = f(sociodemographic variables).
2. Dependent variables = f(sociodemographic variables + main effects of independent variables).
3. Dependent variables = f(sociodemographic variables + main effects of independent variables + interaction of key independent variables).
The mathematical formula of the estimated models is shown in Eq 1.
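The general form implied by the third nested model can be sketched as follows. The notation is assumed here for illustration and may differ from that of Eq 1: $i$ indexes students, $j$ enrolments, $x_{k,ij}$ the remaining independent variables, and $z_{m,ij}$ the variables interacted with sex (age, nationality, term and COVID-19 period).

```latex
\begin{aligned}
\operatorname{logit}\bigl[\Pr(y_{ij}=1)\bigr]
  &= \beta_0 + \beta_1\,\mathrm{sex}_i
   + \sum_{k \ge 2}\beta_k\,x_{k,ij}
   + \sum_{m}\gamma_m\,(\mathrm{sex}_i \times z_{m,ij}), \\
\mathrm{score}_{ij}
  &= \beta_0 + \beta_1\,\mathrm{sex}_i
   + \sum_{k \ge 2}\beta_k\,x_{k,ij}
   + \sum_{m}\gamma_m\,(\mathrm{sex}_i \times z_{m,ij})
   + \varepsilon_{ij}.
\end{aligned}
```

The first line applies to the binary outcomes (evaluated and passed), with $y_{ij}$ the outcome of enrolment $j$ of student $i$; the second applies to the OLS model for the score. Setting all $\gamma_m = 0$ gives the second nested model, and additionally dropping the academic and COVID-19 variables gives the first.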
We ease the interpretation of the results by providing predictive margins for the interactions included in the models. A predictive margin is a postestimation statistic computed from predictions made from a model while some of the covariates are not fixed [56]. The contrast of predictive margins is a postestimation test that measures and tests the significance of the differences between the predictive margins of men and women. All analyses were conducted in Stata statistical software [57].
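The idea of a predictive margin and its contrast can be illustrated with a minimal Python sketch: sex is set to the same value for every observation, the other covariates keep their observed values, and the predicted probabilities are averaged. The coefficients and the small age sample are hypothetical, and this is not a reproduction of the Stata routine used in the study.

```python
import math

# Hypothetical logit coefficients and observed ages, for illustration only.
COEF = {"const": 0.9, "woman": -0.4, "age": -0.01, "woman_x_age": -0.005}
AGES = [24, 31, 38, 45, 52]


def predicted_probability(woman: int, age: float) -> float:
    """Predicted probability from a logit model with a sex-by-age interaction."""
    xb = (COEF["const"] + COEF["woman"] * woman + COEF["age"] * age
          + COEF["woman_x_age"] * woman * age)
    return 1.0 / (1.0 + math.exp(-xb))


def predictive_margin(woman: int, ages: list) -> float:
    """Average prediction with sex fixed and age left at its observed values."""
    return sum(predicted_probability(woman, a) for a in ages) / len(ages)


def contrast(ages: list) -> float:
    """Difference between the predictive margins of women and men."""
    return predictive_margin(1, ages) - predictive_margin(0, ages)


print("margin, men:  ", round(predictive_margin(0, AGES), 3))
print("margin, women:", round(predictive_margin(1, AGES), 3))
print("contrast:     ", round(contrast(AGES), 3))
```

Testing the significance of such a contrast additionally requires its standard error, which this sketch does not compute.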

Results
Fig 1 shows a graphical summary of the results of the estimations for the three whole models with interactions. In this figure, all the sociodemographic, academic and COVID-19 variables' main effects are significant in the three whole models with interactions, with the exception of sex (women) in the evaluated model, nationality in the passed and score models and the political science degree in the evaluated and score models. Regarding interactions, women interactions with age and nationality are not significant in the passed and score models; interactions with the lockdown are not significant in any of the three models, and interactions with terms and the postlockdown period are not significant in the evaluated model. To facilitate the interpretation of the results, the analysis is limited to the whole model with interactions, separating main effects, interactions and COVID-19. However, in the supporting information section, the complete set of nested models is shown in S1-S3 Tables.
To evaluate the performance of the two whole Logit models with interactions, we estimated their receiver operating characteristic (ROC) curves (Fig 2). The ROC curve plots sensitivity (true-positive rate) against 1 - specificity (false-positive rate). The area under the ROC curve (AUC) provides an overall measure of the fit of the model, with values ranging from 0.5 (no discrimination power) to 1 (perfect discrimination). The evaluated model shows greater discriminatory power than the passed model, as the first model presents a higher AUC value (0.760) than the second (0.674).
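To make the AUC scale concrete, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The function name and the toy labels and scores are hypothetical; this is an illustration of the statistic, not the routine used to produce Fig 2.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example (hypothetical data): 1 = passed, 0 = did not pass.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

In this toy example, eight of the nine positive-negative pairs are ordered correctly, so the AUC is 8/9. A model that assigns the same score to every case obtains 0.5, the no-discrimination benchmark mentioned above.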

Main effects
The main effects results are shown in Table 6. Regarding sociodemographic variables, age is a statistically significant variable for the three whole models, although the effect is small (the odds ratio and coefficient are significant, but the effect is very close to zero). The nationality odds ratio is statistically significant only in the evaluated model: the odds of being evaluated decrease by 23.4% for foreigners. For sex, the main variable of this article, Table 6 shows that gender differences are not statistically significant in regard to taking exams (evaluated model), but women underperform significantly, as they passed less often than men and had lower scores. Women's odds of passing decrease by 41.8%, and they are predicted to score 0.552 fewer points (5.52%, as the score ranges from 0 to 10) than men. Therefore, in relation to our first objective, we can confirm that women underperform significantly in our sample, especially in terms of the likelihood of passing exams. Regarding academic variables, the term, the field of the bachelor's degree and the participation variables (CA test and messages) are statistically relevant to predicting the odds of being evaluated and of passing, as well as the final score. Moreover, relevant effect sizes are registered.
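The percentage changes in odds reported here can be related to odds ratios and logit coefficients with a short calculation. The sketch below uses the 41.8% decrease in women's odds of passing quoted above; the helper names and the 0.5 baseline probability are hypothetical and serve only to show that a percentage change in odds is not the same as a change in probability.

```python
import math

def odds_ratio_from_percent(percent_change):
    """Convert a percentage change in odds (e.g. -41.8) into an odds ratio."""
    return 1.0 + percent_change / 100.0

def percent_change_in_odds(odds_ratio):
    """Convert an odds ratio into a percentage change in odds."""
    return (odds_ratio - 1.0) * 100.0

def apply_odds_ratio(probability, odds_ratio):
    """Probability obtained after multiplying the odds of `probability`
    by `odds_ratio`."""
    odds = probability / (1.0 - probability) * odds_ratio
    return odds / (1.0 + odds)

# A 41.8% decrease in the odds of passing corresponds to an odds ratio of 0.582.
or_women_pass = odds_ratio_from_percent(-41.8)

# The matching logit coefficient is the natural log of the odds ratio (negative here).
logit_coef = math.log(or_women_pass)

# Hypothetical baseline: a 0.5 probability of passing (odds of 1) becomes
# 0.582 / 1.582, i.e. about 0.368, not 0.5 minus 0.418.
p_after = apply_odds_ratio(0.5, or_women_pass)
print(round(or_women_pass, 3), round(logit_coef, 3), round(p_after, 3))
```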

The term in which the subject is taught is statistically significantly related to being evaluated, to passing and to the obtained score. Academic results are expected to be lower in the second term: students' odds of being evaluated decrease by 44.5%, and their odds of passing decrease by 23.3%. The third model shows that scores in the second term are 0.481 points lower than those in the first term.
Regarding degrees, the odds of being evaluated are predicted to be 1.278 times as high in business administration and 2.292 times as high in tourism as in economics. However, students of business administration have lower odds of passing an exam (-36.8%) than students of economics, and their scores are expected to be lower by 0.559 points, as shown in Table 6. Students of tourism and political science show increased odds of passing (28.3% and 67%, respectively) relative to students of economics, but only in tourism is the score expected to be higher (0.243 points).
The participation variables (CA test and messages in forums) are statistically significant in all models, as expected. Participation means that students actively follow the course and, consequently, their probabilities of taking an exam, passing and obtaining a good grade are much higher than for those who do not participate. Since taking a CA test contributes to one's final grade, its effect on the odds of any dependent variable is greater than that of participation in forums (messages).
COVID-19 restrictions have also impacted academic results. Both the lockdown period and the subsequent movement restrictions positively and significantly impacted being evaluated, with a relevant effect size (152.5% odds increase for the lockdown period and 154.1% odds increase for the postlockdown period). However, higher odds of passing (an increase in odds of 26.7%) and a higher score (by 0.474 points) are predicted only for the lockdown period. After the lockdown period, when some movement restrictions were still in place, academic results were expected to be worse than in the period prior to the COVID-19 pandemic, with lower odds of passing (reduced by 38.6%) and lower scores (a reduction of 0.476 points).

Intersectionality: Heterogeneity analysis within the gender perspective
Having dealt with the main objective of the article in the previous subsection, this subsection analyses not only how sex affects academic results per se but also how it affects such results through its interaction with other variables, in order to carry out a gender intersectional analysis. The relation between sex and the COVID-19 situation is treated separately. Finally, for the sake of statistical parsimony, some interactions (degrees, CA testing and messages) were excluded from the definitive model. Fig 3 shows graphically the results of the gender intersectional analysis of the three whole models with interactions in Panels A (sex and age), B (sex and nationality) and C (sex and term), while Table 7 shows the quantitative results of the analysis.
Graphically, there are significant differences if the standard deviation error bars of the categories of the sex variable do not overlap. As can be seen in Fig 3, there are significant sex differences for some ages in the three models of the sex and age interaction (Fig 3, A1-A3), for Spaniards in the passed and score models of the sex and nationality interaction (Fig 3, B5 & B6) and for the first term in the passed and score models of the sex and term interaction (Fig 3, C8 & C9). Finally, there is also a significant difference for the second term in the sex and term interaction of the score model (Fig 3, C9).
Quantitatively, a significant interaction term enhances, neutralizes or offsets the main effects of the variables included in the interaction. The age main effect shows a slight decrease (0.5%) in the odds of being evaluated. As women age, their odds of being evaluated decrease by a further 1.2%, as the widening gap between men and women in Fig 3, A1 shows. Women who take their exams in the second term have 35.9% higher odds of passing once the negative main effects of being a woman (a 41.8% decrease in the odds of passing) and of taking an exam in the second term (a 23.3% decrease in the odds of passing) are considered. Thus, in this case, the interaction term partially offsets the main effects, since the interaction increases the odds of passing and the main effects decrease them. This explains why the difference between men and women in the second term is smaller than that in the first term in Fig 3, C8. The same occurs for foreign women: the interaction term partially offsets the main effects, reducing the gap between men and women (Fig 3, B5).
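The offsetting role of the interaction can be illustrated by multiplying the odds ratios implied by the percentages quoted above for the passed model. This is a simplified sketch: it assumes that odds ratios combine multiplicatively, as in a logit model, and it holds the other interactions in the model (age, nationality and COVID-19 period) at their reference values, so the resulting figures are illustrative rather than estimates reported in the article. The variable names are hypothetical.

```python
# Odds ratios implied by the percentage changes quoted in the text (passed model).
or_woman = 1 - 0.418          # main effect of being a woman: 0.582
or_second_term = 1 - 0.233    # main effect of the second term: 0.767
or_interaction = 1 + 0.359    # woman x second term interaction: 1.359

# Women versus men within the first term: only the main effect applies.
women_vs_men_first_term = or_woman

# Women versus men within the second term: main effect times interaction,
# about 0.791, i.e. a smaller gap than in the first term.
women_vs_men_second_term = or_woman * or_interaction

# Women in the second term versus men in the first term: all three factors,
# about 0.607.
women_second_vs_men_first = or_woman * or_second_term * or_interaction

print(round(women_vs_men_first_term, 3),
      round(women_vs_men_second_term, 3),
      round(women_second_vs_men_first, 3))
```

Under these assumptions, the odds ratio of women relative to men moves from about 0.58 in the first term to about 0.79 in the second, which is the partial offsetting described above.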
Considering sex and age jointly, there are relevant differences between men and women in different age groups, especially regarding predictions of passing and scoring, as shown in Fig 3, A2 & A3. As we consider this interaction the most important one, the differences between men and women are represented graphically in Fig 4 in addition to Table 8. In Fig 4, differences between men and women are significant if the standard deviation error bars do not cross the zero line. Men aged 40 and over tended to be evaluated more than their female counterparts, although the difference was small (Fig 3, A1 and Table 8). No statistically significant differences are found under the age of 40. However, men of 20 to 70 years of age are expected to outperform women in terms of passing exams (Fig 3, A2 and Table 8) and scores (Fig 3, A3 and Table 8), with greater differences found between the ages of 30 and 45 (Fig 3, A2 & A3 and Table 8).
Regarding sex and nationality (Table 9), there are almost no expected differences between men and women in terms of being evaluated, although Spanish women tend to be evaluated less than men, while foreign-born women tend to have a higher likelihood of being evaluated. Although the contrast shows a statistically significant adverse effect for Spanish women versus Spanish men, the difference in the predictive margins is small in the evaluated model (-0.028) and passed model (-0.088). Regarding the score model, the difference is similar to its main effect (-0.481 points). On the other hand, there is no statistically significant gender difference for foreign-born students regarding passing and scoring, but there is for evaluation, with women exhibiting a small positive difference in this case (0.080).

Regarding term and sex jointly (Table 9), there is a slight and statistically significant difference between the predictive margins of being evaluated in the first term (-0.035). However, no difference is found in the second term. Men are expected to outperform women, with higher predictive margins in the first term, measured by both passing exams and scores. The difference between the predictive margins of passing is statistically significant in the first term (-0.131), while it is not in the second term. Regarding scores, in the first term, women are predicted to obtain 0.655 fewer points than men, whereas their underperformance in the second term is smaller, at -0.241 points.

COVID-19 and sex interaction
Table 10 shows the estimation results of the interaction between COVID-19 and sex. Regarding the passed model main effects, as stated above, the odds of passing decreased for women (reduction of 41.8%) as well as for exams taken after the lockdown (reduction of 38.6%), while these odds increased during the lockdown by 26.7%. Since the odds ratio of the interaction between women and the period after the lockdown has a value below 1, women's odds of passing decrease by a further 23.8% once the negative effects of both main effects are considered, as the widening gap between men and women in Fig 5, E11 shows.
The results of the score model are similar to those of the passed model. First, the main effects coefficients for women and for the period after the lockdown both show a negative effect on the score, while the lockdown has a positive effect. Second, the interaction coefficient (women after the lockdown) is negative and significant, showing that women after the lockdown exhibited an additional decrease in the score on top of the negative effects of both main effects. Thus, the gap between men and women increased, as Fig 5, E12 shows.
Table 11 shows the contrasts of predictive margins for the COVID-19 and sex interaction. When COVID-19 periods and sex are considered together, the small differences found for evaluation before COVID-19 disappear during and after the lockdown period. Before COVID-19, the difference in the predictive margins of being evaluated is significant (-0.029), but afterwards there are no significant differences. In contrast, the differences between men and women in the predictive margins of passing and scores decrease and are nonsignificant during the lockdown period, whereas they are significant before and after it. After the lockdown period, moreover, they increase in comparison to pre-COVID-19 levels. Both men's and women's expected results (passing and scores) worsened, but women's underperformance increased and again became statistically significant. The difference in the predictive margins of passing widened to -0.182, compared with only -0.086 before COVID-19. Moreover, the differences in expected scores were the highest of the whole set of variables studied. Women were expected to obtain 0.992 fewer points than men over the postlockdown period, more than doubling the pre-COVID-19 difference of -0.429 points.
Up to 60 years of age, women clearly underperform.

These results support Richardson et al.'s claim that women's underperformance happens in their central age period [24] and especially in the first term, when the Christmas holiday occurs before exams. Because women care for children more than men do, gender differences clearly affect academic performance at the ages when students are more likely to have small children, as is also observed in the labour market [36]. Since almost half of the students of both sexes in our sample are between 30 and 45 years of age, this asymmetry can have a major impact on women's outcomes. Assuming that male UNED students behave similarly to average Spanish men [37,38], they likely take care of their children less than female UNED students. Accordingly, men have more time in general and especially during school breaks such as the Christmas holiday, which lasts two weeks in Spain. As a result, men can study longer hours and obtain better results, even though women and men decide to take exams equally. Nonetheless, the Christmas break should have a more positive effect for both sexes in the first term than in the second, as in the latter there is no holiday before exams. In addition, as exams are taken at the end of a term, the accumulated fatigue from caring for children, working and studying could play a role in women's underperformance. This could explain the significance of the term variable.
Furthermore, the fact that not all women underperform strengthens our conclusions: older students, both men and women, have the best results, and there is no statistically significant gender gap among them. Among the causes of better results among older students are deeper learning styles [9] but also fewer conflicts between occupational and family responsibilities and education commitments.
The nationality variable does not offer significant results for foreign students, although their presence in the sample (4% of the observations) is limited. On the other hand, the participation variables confirm their importance in terms of passing a course.
Regarding COVID-19, there is some evidence of better results during the lockdown period, but not in the postlockdown period. This is in line with the literature, which attributes the improvement to better time management and greater involvement in subjects [10] rather than to softer standards in online exams: this method of evaluation was maintained in the postlockdown period, when academic results worsened consistently, especially for women. The greater underperformance of women in the postlockdown period could be attributed to different causes, such as increased domestic and care responsibilities once people could move more freely. The literature shows that women assumed a higher proportion of tasks due to a lack of education services from September 2020 to June 2021 [21], which eventually contributed to the increased difference in the postlockdown period. In summary, the situation in the postlockdown period is far from reflecting the pre-COVID-19 period, which explains the worsening learning outcomes for both sexes and the increasing underperformance of women. Men may have returned to normality sooner, as they do not bear the burden of household and family chores.
One of the main limitations of the paper lies in a lack of control variables, such as being married and the number of children, used as a proxy of care burdens. Such data were not collected either by the university in the administrative enrolment process or by teaching staff. As a result, these variables are missing from our sample. The main finding of the paper (the underperformance of middle-aged women in distance education in economics) is clearly compatible with the argument that women bear the brunt of domestic and care needs and shows a worsening of this situation due to the lockdown period. However, further research based on ad hoc data should be performed to analytically test this hypothesis.

Conclusions
We found evidence of gender gaps in distance learning in applied economics at UNED. In our sample, women clearly underperform. Age, the field of education, and other academic and structural factors are key variables for understanding where and why men still outperform. This article provides evidence of the existence of gender gaps in higher education on several dimensions.
First, the field of knowledge seems to be relevant to understanding these gaps, as they do not occur homogeneously. Aggregated UNED data show differences between fields of knowledge and degrees, as we have seen in Table 1. Besides, this article finds sound evidence that women underperform in economics, not only in enrolment percentages and the achievement of degrees [45,46], but also in pass rates and scores.
Second, regarding gender intersectionality, age and term variables clearly affect women's results, especially when students are in the central years of life (30-45). At this age, students usually have work and family responsibilities; the latter are mainly taken on by women, and could probably result in lower academic performance. In addition, women's underperformance is considerably worse in the first term. The two-week Christmas holidays are a possible explanation for this fact, as children are at home just one month before exams. When the distribution of family and care tasks is not balanced by sex, men have more time to study, while women have less.
Third, the previous conclusion has deep implications for higher education distance learning. As the average distance learning student is usually older than his or her in-class counterpart, gender gaps may appear more easily and be more significant. In fact, these gaps may replicate those of the labour market, as one of the most important underlying factors is the same asymmetry in the distribution of household and family chores.
Fourth, the academic consequences of the COVID-19 period have affected students differently by sex, as previous literature has shown. Over the lockdown period, both sexes slightly improved their results. Nevertheless, after this period, women's results worsened in comparison to those of the pre-COVID-19 period to a greater extent than men's. This could be explained in relation to care and domestic needs, as Richardson et al. (1999) suggested two decades ago [24]. During the lockdown period, care needs were higher due to the closing of educational centres and the inability to outsource, and more men were present at home, slightly increasing their participation in meeting these needs. While most of the burden fell on women, who were already doing most of the housework before the lockdown [19], sharing some tasks helped reduce the gender gap in learning outcomes. However, once the lockdown passed, men returned to work as in the pre-COVID-19 period and reduced their presence and participation at home, while care and domestic needs increased due to temporary and partial lockdowns, work from home and the maintenance of the previous gendered distribution of care and household chores [21].
Finally, from a policy perspective, some lessons can be learned. First, online exams are not the cause of higher scores, as UNED maintained this type of evaluation in the postlockdown period, yet scores worsened. Thus, the use of online exams could be considered academically acceptable. Second, flexibility regarding dates of exams, which was reduced at UNED over the lockdown and postlockdown periods, may affect men and women differently. As women usually take on more tasks than men, they need more flexibility; as a result, mechanisms to restore or increase flexibility could reduce the gender gap. In addition, flexibility is not only needed on final test dates; more flexibility related to taking CA tests could also result in less gender bias. Third, continuous assessment does not seem to be the obvious means to accommodate both family and school responsibilities: if mandatory tasks increase beyond CA and final exams, the gender gap in nonface-to-face higher education could be widened. Finally, given the greater gender differences in academic results over the first term and the higher scores for both sexes relative to the second term, degree programs may consider this information to structure content and time allocation to improve students' results in an ungendered way in the medium and long terms.
Regarding future research, it would be interesting to analyse key factors of the learning-teaching process, as well as sociolabour and family conditions, to explain and contrast the causes of the gender gap found in our students. In addition, analyses could be applied to other faculties with similar gender gaps in global educational outcomes, such as psychology, political science and sociology (see Table 1).
However, if women's underperformance is due to less participation from men in care and domestic tasks, this trend should be persistent. Thus, further research on the relation between gender gaps at work or in academic results and family tasks is needed to support policy reforms and measures that increase men's participation in domestic and care needs. The improvement of work-life balance and of men's involvement at home and with family needs would reduce gender gaps not only at home and in the labour market [21] but also in higher education, especially for middle-aged people.
Work-life balance problems related to a lack of time and double shifts are especially relevant for women in university studies [21]. A lack of time to study at a distance university is also likely a gendered issue related to the sexual division of labour, similar to the different gender gaps found in the labour market. Thus, beyond university changes and improvements, structural policy reforms such as universal early childhood education and long-term care services are needed to reduce gender gaps in higher education when students have labour and family responsibilities [21,58].
Similarly, gender gaps in economics learning outcomes seem to point to a more structural issue related to gender inequality in education and society and to how gender imbalances are socially perceived and transmitted. On the one hand, attention to inequality in education is focused on areas where women are less present, such as the underrepresentation of women in STEM or in the labour market, while the underrepresentation of men in some fields, such as education or care and domestic work, is not systematically considered a problem [59]. On the other hand, generalizations erase the nuances and subtleties of gender inequalities, where gender intersectional analysis is key and brings awareness of the origins of gender imbalances. Women do not perform better or worse than men at all stages of the education system or in all branches [60]. Gender inequality persists in many aspects of education [25], and this article highlights this issue; furthermore, gendered student performance could also be seen as an indicator of a deeper social inequality problem.
Supporting information S1